Examining the Use of Region Web Counts for ESL Error Detection
Abstract
Significant work is being done to develop NLP systems that can detect writing errors produced by non-native English speakers. A major obstacle, however, is the lack of error-annotated training data needed to build the statistical models that drive these systems. As a result, many systems are trained on well-formed text, with no modeling of the typical errors that non-native speakers produce. To address this issue, we propose a novel method that uses geographic region-specific web counts to detect typical errors in the writing of non-native speakers. In this paper we describe the approach and present an analysis of the issues involved in using web counts.
Similar resources
Detection and Modeling of Medium-Scale Travelling Ionospheric Disturbances in Iran Region
Variations in the ionosphere layer are divided into regular and irregular. Regular changes include daily changes, changes that depend on latitude, and changes due to solar activity. Travelling Ionospheric Disturbances (TIDs) are one type of irregular ionospheric change and are categorized into small, medium, and large scales. Medium-scale Travelling Ionospheric Disturbance (MSTID) which are propa...
Anomaly-based Web Attack Detection: The Application of Deep Neural Network Seq2Seq With Attention Mechanism
Today, use of the Internet and websites has become an integral part of people's lives, and most activities and important data reside on websites. Attempts to intrude into these websites have therefore grown exponentially. Intrusion detection systems (IDS) for web attacks are one approach to protecting users. However, these systems suffer from such drawbacks as low accuracy in ...
Search right and thou shalt find ... Using Web Queries for Learner Error Detection
We investigate the use of web search queries for detecting errors in non-native writing. Distinguishing a correct sequence of words from a sequence with a learner error is a baseline task that any error detection and correction system needs to address. Using a large corpus of error-annotated learner data, we investigate whether web search result counts can be used to distinguish correct from in...
Using Contextual Speller Techniques and Language Modeling for ESL Error Correction
We present a modular system for detection and correction of errors made by nonnative (English as a Second Language = ESL) writers. We focus on two error types: the incorrect use of determiners and the choice of prepositions. We use a decision-tree approach inspired by contextual spelling systems for detection and correction suggestions, and a large language model trained on the Gigaword corpus t...
Analyzing new features of infected web content in detection of malicious web pages
Recent improvements in web standards and technologies enable attackers to hide and obfuscate infectious code with new methods and thus escape security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...